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Populations of mammalian stem cells commonly exhibit considerable cell-cell variability. However, 
the functional role of this diversity is unclear. Here, we analyze expression fluctuations of the stem 
cell surface marker Seal in mouse hematopoietic progenitor cells using a simple stochastic model and 
find that the observed dynamics naturally lie close to a critical state, thereby producing a diverse 
population that is able to respond rapidly to environmental changes. We propose an information- 
theoretic interpretation of these results that views cellular multipotency as an instance of maximum 
entropy statistical inference. 

PACS numbers: 87.10.Vg, 89.70.-a, 89.70.Cf, 87.18.Tt, 87.10.Mn, 87.17.Aa 


Clonal populations of unicellular organisms often ex¬ 
hibit phenotypic diversity, which confers selective ad¬ 
vantage under adverse environmental conditions. Well- 
known examples include antibiotic bacterial persistence, 
the lysis-lysogeny switch of A-phage, competence devel¬ 
opment and sporulation of B. subtilis, and lactose uptake 
by E. coll [1]. The ubiquity of this phenomenon indi¬ 
cates that it is a generic, evolvable, mechanism that fa¬ 
cilitates collective cellular dynamics by enabling robust, 
rapid responses to diverse environmental changes. Re¬ 
cently, stochastic fluctuations in expression of important 
marker proteins have been seen to generate functional 
diversity within multipotent mammalian stem cell pop¬ 
ulations, suggesting a similar role for cell-cell variability 
in higher organisms [2]. These observations have moti¬ 
vated speculation that functional multipotency (the abil¬ 
ity to differentiate along a number of distinct cellular 
lineages) is a collective property of stem and progenitor 
cell populations, reflective of fitness constraints imposed 
at the population, rather than individual cell, level [3]. 
This perspective is appealing since such regulated cell¬ 
cell variability in principle allows a cellular population 
to remain primed to respond quickly to a range of dif¬ 
ferent differentiation cues while remaining robust to cell 
loss. However, convincing demonstrations of the potency 
of individual stem cells appear to argue strongly against 
such a collective view (for example, single long-term re¬ 
populating hematopoietic stem cells are able to fully re¬ 
constitute the blood system of lethally irradiated adult 
mice and small numbers of pluripotent stem cells are able 
to rescue development of genetically compromised em¬ 
bryos HD- Thus, it is still unclear how population-level 
and cell-intrinsic regulatory programs interact to control 
mammalian stem and progenitor cell dynamics. 

Here we propose a theoretical framework that recon¬ 
ciles these disparate observations, which views cellular 
multipotency as an instance of maximum entropy sta¬ 
tistical inference. In this view, individual cells satisfy 
any minimal regulatory constraints imposed upon them 


(such as basic metabolic requirements, etc.) yet, in the 
absence of defined instructions, are maximally noncom¬ 
mittal with respect to their remaining molecular iden¬ 
tity, thereby generating a diverse population that is able 
to respond optimally to a range of unforeseen future 
environmental changes. Thus, rather than viewing the 
multipotent cell state as an attractor of the underlying 
molecular regulatory dynamics (i.e. associating cellular 
identities with well-defined, stable, patterns of gene ex¬ 
pression - a common modeling assumption, that has re¬ 
ceived some experimental validation for differentiated cell 
types 0 ), individual multipotent cells are characterized 
by fundamental uncertainty in their molecular state and 
their populations exhibit variability in accordance with 
this intrinsic uncertainty. However, since this model ex¬ 
changes the attractor hypothesis at the single cell level 
for an ergodicity assumption for the underlying stochas¬ 
tic processes, each individual cell has the latent poten¬ 
tial to assume every identity within the population, and 
thereby retains the regenerative capacity of the entire 
population. As this view is fundamentally stochastic, its 
corollary is that regulation of multipotency occurs at the 
level of probabilities (i.e. at the population level), rather 
than at the individual cell level. 

In order to illustrate this perspective we consider here 
the expression dynamics of the stem cell surface marker 
Seal (stem cell antigen 1) in populations of multipotent 
EML mouse hematopoietic progenitor cells. It has previ¬ 
ously been shown that Seal levels fluctuate stochastically 
in EML cells in culture, with extrinsic ‘transcriptome¬ 
wide’ noise driving transitions between Seal high and 
Seal low states, which transiently prime individual cells 
for erythroid and myeloid differentiation respectively and 
generate a characteristically bimodal Seal expression dis¬ 
tribution within the population (see Fig. bottom panel 
and Ref. i)- However, the underlying mechanisms by 
which these stochastic fluctuations are regulated are not 
known. In the absence of this knowledge we assume here 
that the intracellular dynamics of the Seal expression 
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level z{t) are described by a generic stochastic differen¬ 
tial equation: 


— = a{z) + y/2d{z)^{t), 


where ^{t) is a standard one-dimensional white noise pro¬ 
cess [(^(t)) = 0 and = S(t — s)] and d(z) ac¬ 

counts for fluctuations in Seal levels due to both intrin¬ 
sic sources (i.e. noise in the molecular processes involved 
in Seal production/decay, such as transcription, trans¬ 
lation, translocation and degradation etc.) and extrinsic 
sources (i.e. fluctuations in upstream regulators and un¬ 
controlled environmental noise). Rather than model Seal 
levels directly it is convenient to introduce a reaction co¬ 
ordinate x{z) such that the Fokker-Planck equation for 
the probability density p(x,t) has the form 


dp 


L[p), L{p) 


d / dtp \ 
dx \da; 


d'^p 
dx"^ ’ 


( 1 ) 


with scalar potential tp{x) and diffusion coefficient tr. 
Such a transformation, which maps the original dynam¬ 
ics to those of a Brownian particle in a one-dimensional 
potential field, may be achieved by application of Ito’s 
lemma (see Supplemental Material for details). The sta¬ 
tionary solution of Eq. Q is the Boltzmann-Gibbs dis¬ 
tribution 


p^{x) = Z ^ exp{-tp/a), 


Z = y eyip{—tp/cj)dx. (2) 


This solution exists so long as tp{x) grows sufficiently 
rapidly as |a;| —>■ oo that the partition function Z remains 
finite. In this case, the dynamics are ergodic and the free 
energy 

F{p) = j tppdx-\-a J plogpda;, 

= E{p) - aS{p), 

where E{p) and S{p) are the energy and entropy func¬ 
tionals respectively, is a Lyapunov functional for the dy¬ 
namics. Thus, in order to model Seal dynamics phe¬ 
nomenologically we need only chose an appropriate reac¬ 
tion coordinate x and form for the potential tp(x). 

Since noise in protein expression often scales with 
abundance, a natural choice for the reaction coordinate is 
x = logz, as has been taken elsewhere (see Supplemen¬ 
tal Material for details) [1]. In the absence of detailed 
information on how Seal fluctuations are regulated, the 
potential tp{x) may be estimated numerically from the 
empirical Seal distribution by inverting Eq. ([^. The 
model then has a single free parameter, the diffusion co¬ 
efficient cr, which sets the timescale for the dynamics. 

Estimates of cr and tp{x) were obtained by model fitting 
using maximum likelihood estimation to evolving Seal 
expression distributions obtained experimentally using 
flow cytometry starting from pre-selected populations of 
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Figure 1. Model fit to experimental data. Model simulations 
using the same estimates of tp(x) and cr are shown against the 
three independent experimental time-series; simulations differ 
only in the experimentally prescribed initial conditions. Data 
is in dark red and the fitted model is in black. The potential 
tp{x) was estimated numerically via Eq. (§ using aggregated 
data from the flnal time point. Color online. 


Seal low, mid, and high expressing cells as they equi¬ 
librate in culture over a period of 18 days (obtained in 
Ref [B]). Despite the simplicity of this model, an ex¬ 
cellent agreement with the experimental time-series data 
was observed from all three initial conditions, using the 
same numerically estimated potential and the same esti¬ 
mate of a (Figs. Hi¬ 
lt has previously been argued, based upon analysis of 
changing proportions of cells in the Seal high and low 
states, that the observed dynamics are characterized by 
slow ‘sigmoidal’ relaxation towards the stationary state 
[6] . Since a constant probability flux across a barrier nat¬ 
urally leads to exponential relaxation, it was suggested 
that these dynamics indicate deviation from expected 
first-order kinetics, possibly due to regulation of Seal 
fluctuations by cell-cell communication or autocrine sig- 
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the FPT distributions Fx{x, t) = —dG/dt for X G {L, H} 
may be obtained. Conventionally, the fpt distribution 
Fx{x,t) is evaluated from the local minima xx of ^/’(a;) 
in X, since this is the state of highest probability. Al¬ 
ternatively, we can weight each initial position within X 
according to the probability that the cell is at this po¬ 
sition at equilibrium. We thus define the expected fpt 
distribution with respect to the Gibbs measure, 


Figure 2. (Left) Convergence to equilibrium with respect to 
the free energy. Exponential convergence was observed from 
all three initial conditions for large time, in accordance with 
Eq. (fy. First passage time (fpt) distributions in the Seal 
low (Middle) and high (Right) states. The fpt distributions 
Fx{xx,t) starting at the local minima of the potential '^’{x) 
are shown in black; the expected fpt distributions {Fx}{t) 
averaging over all initial conditions in A £ {L,H} are shown 
in blue. Color online. 


naling. However, it is apparent that such recourse is not 
needed since in all 3 cases the experimental system is 
initially far from equilibrium, and therefore far from the 
regime in which first-order kinetics apply. Rather, in ac¬ 
cordance with standard reaction-rate theory, the dynam¬ 
ics are characterized by an initial transient period during 
which local equilibrium is first established within each 
potential well, before transitions between wells occur [8]. 
Examination of the free energy (which is a natural way 
to assess convergence to equilibrium 0 ) shows that this 
separation of timescales naturally generates the observed 
convergence dynamics without the need to include addi¬ 
tional regulatory mechanisms in the model (see Fig. 
left). These results indicate that the observed Seal ex¬ 
pression dynamics are well described by a simple ergodic 
process in which individual cells behave independently 
with respect to Seal fluctuations. 

This ergodic property is useful since it allows infer¬ 
ence of the behavior of individual cells from the popula¬ 
tion dynamics. While stochastic excursions into the Seal 
high and low states have previously been seen to tran¬ 
siently confer different lineage biases to individual pro¬ 
genitor cells in culture, the timescales upon which these 
excursions occur at the single cell level are not known. 
Thus, the distribution of first passage times (fpts) out 
of the Seal low and high states are of particular interest. 
Defining the ranges of Seal low and high expression as 
L = (—oo^xq) and FI = (a;o,oo) respectively, where xq is 
the intermediate maxima in 'ip(x), the fpt T{x) out of 
X for a cell initially at x G X (where X G {L,JI}) may 
be obtained from the backward Fokker-Planck equation 
associated with Eq. ([^. Denoting G{x,t) = P{T{x) > t) 
we solve: 

^ ^ d'^G 

dt dx dx dx"^ ’ 


{Fx)it)=[ Fx{x,t) dx, 

where wx = Xcgx ^ weight of 

the population in X. Numerical approximations to 
Fx{xx^t) and {Fx){t) are shown in Fig. These distri¬ 
butions yield mean fpts of 60/56 hours for the low state 
and 1573/1487 hours for the high state using Fx{xx,t) 
and {Fx)it) respectively. These timescales are substan¬ 
tially longer than the EML cell cycle time (approx. 10 -14 
hours m), and therefore suggest that Seal fluctuations 
are not simply a consequence of the cell-cycle. Rather, 
by setting the expected length of time that a pair of cells 
initially at the same position (e.g. daughter cells from 
the same cell division) will forget their common origin - 
and therefore the expected length of time that their iden¬ 
tities will be coupled - Seal switching appears to encode 
an elementary form of epigenetic memory that endows in¬ 
dividual cells with a transient functional identity. Since 
the rate of switching is slower than the rate of cell di¬ 
vision this allows the formation of communities of cells 
that maintain the same characteristics though divisions, 
and are therefore able to adopt a temporarily stable func¬ 
tional phenotype. Yet, by allowing mixing between the 
communities on a feasible time-scale. Seal fluctuations 
also safeguard long-term cell-cell variability and ensure 
that a robustly heterogeneous population, able to rapidly 
respond to a range of environmental challenges and re¬ 
silient to removal of cellular sub-types, is maintained. 

These results indicate that regulated fluctuations in 
Seal levels may be an intrinsic feature of EML cells in 
culture since they provide a mechanism by which the 
population hedges against unforeseen future environmen¬ 
tal challenges and thereby retains the capacity to differ¬ 
entiate along both erythroid and/or myeloid lineages as 
required. If this is the case, then it is natural to ask if 
the experimentally observed stationary Seal distribution 
is optimal for this purpose; that is, if it is maximally vari¬ 
able in some appropriately defined way. To investigate 
this, it is convenient to introduce a parameterization of 
the potential 'il’(x), in order to compare distributions. A 
parsimonious model, which allows for observed bimodal¬ 
ity without introducing large numbers of parameters, is: 


dip 

dx 


= Px — ao 


aix'^ 

AT" -k x" ’ 


with initial conditions G(x,0) = 1 for x G X and bound- where n is a positive even integer |llj . Intuitively, this 
ary conditions G(a;o: i) = 9G/9a;(±oo, t) = 0, from which is a simple model of a positive-feedback based bistable 
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Figure 3. (Left) Entropy of the stationary distribution rela¬ 
tive to the maximum entropy distribution over the ay-plane. 
The empirical distribution is marked with a magenta cross 
and the maximum entropy distribution p^^{x) is marked 
with a green circle. Color shows percentiles. (Right) Min¬ 
imum MFPT T in the vicinity of the maximum entropy distri¬ 
bution (close-up over the dashed box in the left panel). The 
critical lines separating the bistable and monostable regimes 
are shown in blue. The empirical distribution lies in the small 
region of the ay-plane that is both close to critical and of high 
entropy. Color shows dimensionless time. Color online. 


switch of the kind that commonly regulate cell fate 
changes [T2]. The stationary distribution Poo{x) is then 
characterized by four nonnegative dimensionless param¬ 
eters: 0 — [n, a = a^jai, y = j5K/ai, dd = o'/S/a^]. 

For fixed 0, the conditional probability Poo{x \ 9) is the 
minimizer of the free energy F(p), and may therefore be 
viewed as the most non-committal way to assign proba¬ 
bilities subject to the particular constrains imposed upon 
the dynamics by 6) (i.e. an instance of maximum 
entropy statistical inference) [13]. As each set of model 
parameters defines a different potential, which places dif¬ 
ferent constraints upon the dynamics, we may therefore 
determine the extent to which Seal fluctuations opti¬ 
mize population diversity by assessing the proximity of 
the empirical stationary Seal distribution to the maxi¬ 
mum entropy distribution p^^(x) = Poo(x\6*), where 
S{poo{x\9*)) = maxe S'(poo(a; I 0))- The relative en¬ 
tropy, 

^(Poo II p““) = /poo log dx, 

is a natural way to measure this proximity. Since the Hill 
coefficient n is, informally, a measure of the sensitivity of 
the underlying switch to input stimulus, it primarily af¬ 
fects the curvature of the potential around the local min¬ 
ima xq (where present) and does not have a strong effect 
on the entropy. However, by governing a cusp bifurca¬ 
tion that determines whether the underlying switch is in 
a monostable or bistable state, a and y can affect the 
entropy of the stationary distribution considerably. Fig. 

shows how the relative entropy of Pao{x) varies over 
the biologically relevant bistable region of the ay-plane 
M- It can be seen that the point estimate for the exper¬ 
imentally observed Seal distribution is remarkably close 


to the maximum entropy distribution p^^(x). However, 
while the maximum entropy distribution is in the center 
of the bistable regime, the empirical distribution is close 
to one of the critical lines that separate the bistable and 
monostable regimes (shown in blue in Fig. right). It 
has long been suggested that such criticality may emerge 
naturally in biological systems via self-organizing evolu¬ 
tionary processes without the need for fine-tuning (i.e. as 
an attractor of the evolutionary dynamics) since critical 
states provide the dual benefits of stability and adaptabil¬ 
ity m- Here, proximity to criticality specifically regu¬ 
lates the rate of mixing between the Seal high and low 
subpopulations, and therefore the response time of the 
population to environmental changes. To illustrate this. 
Fig. 3 also shows how r = min [r_, r+j, where t_ and t_|_ 
are the mean first passage times (mfpts) in the low and 
high states respectively, varies in the vicinity of the max¬ 
imum entropy state in the ay-plane. It can be seen that 
the minimum MFPT in the maximum entropy state is ap¬ 
proximately an order of magnitude greater than that of 
the empirical distribution. Thus, while a population dis¬ 
tributed according to the maximum entropy distribution 
would ultimately able to adapt better to environmental 
changes than the empirical population, it could not do 
so as rapidly. In this regard, close proximity to criti¬ 
cality is vital since it ensures that a diverse population 
is produced, yet mixing between subpopulations occurs 
on a physically relevant time-scale. These results sug¬ 
gest that Seal levels are regulated by htness constraints 
that involve a trade-off between maximizing cell-cell vari¬ 
ability and maintaining the ability to respond rapidly to 
environmental changes. 

In summary, we have proposed an information- 
theoretic interpretation of stem cell dynamics that views 
cellular multipotency as an instance of maximum entropy 
statistical inference. Although we have focused on Seal 
dynamics, comperable expression fluctuations are known 
to generate functional diversity in other mammalian stem 
cell systems HI US], and similar ergodic dynamics have 
been observed to give rise to universal protein expression 
distributions in microorganisms m- Thus, the genera¬ 
tion of ergodic expression fluctuations may be a generic 
way in which cell populations maintain robust multilin¬ 
eage differentiation potential under environmental uncer¬ 
tainty. If so, then molecular noise processing could be 
particularly important in regulating stem cell function 
in a range of contexts. A better understanding of the 
relationship between molecular noise and stem cell iden¬ 
tity should help to distinguish variability due to inter¬ 
changeable subpopulations of cells from that due to the 
presence of distinct, non-interconvertible, cell types (i.e. 
to determine which underlying stochastic processes are 
ergodic) [15]. We anticipate that advances in single cell 
profiling techniques will help to address these issues in 
the near future. 
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Supplementary Material 

We assume that the intracellular dynamics of Seal expression level z{t) are given by the following generic stochastic 
differential equation: 


— = o(z) + ^2d{z)^(t), 

where ^{t) is a standard one-dimensional white noise process [(C(i)) = 0 and (C(i)^(s)) = 5{t — s)] and d{z) accounts 
for fluctuations in Seal levels due to both intrinsic and extrinsic sources. This equation can be written in a more 
convenient form by introducing an appropriate reaction coordinate x{z) such that the dynamics are mapped to those 
of a Brownian particle in a one-dimensional potential field. Such a transformation may be achieved by application of 
Ito’s lemma, which reads: 


dx 

dt 



+ d{z) 


d^x 




Firstly, the reaction coordinate x{z) can be chosen such that the noise term in this equation is constant, say 
which gives the transformation 


X = 


I 



Since the dynamics are one-dimensional we may also introduce a potential ijj{x) such that 


d-ip 

da; 


da; 

“<=>E 


-b d{z) 


d^a; 
dz^ ’ 


(S3) 


to obtain 


dx 

dt 


d^ 

dx 


-b 


which is the stochastic differential equation corresponding to the Fokker-Planck equation given in the main text. 
Experimental data suggests that protein expression fluctuations often scale linearly with expression level [T]. Thus, 
a natural choice for the noise term is d{z) = Substituting this into Eq. (S31 gives x = log(z). This approach is 
similar to that taken in Ref. [1]. 
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